Database Selection Using Document and Collection Term Frequencies
نویسندگان
چکیده
We examine the impact of two types of information — document frequency (df) and collection term frequency (ctf) — on the effectiveness of database selection. We introduce a family of database selection algorithms based on this information, and compare their effectiveness to two existing database selection approaches, CORI [3] and gGlOSS [9]. We demonstrate that a simple selection algorithm that uses only document frequency information is more effective than gGlOSS, and achieves effectiveness that is very close to that of CORI.
منابع مشابه
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...
متن کاملApply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As moving to big data world where data is increasing in unstructured way with high velocity, there is a need of data-store to store this bundle amount of data. Traditionally, relational databases are used which are now not compatible to handle this large amount of data, so it is needed to move on to non-relational data-stores. In the current study, we have proposed an extension of the Mongo...
متن کاملIntegrated Clustering and Feature Selection Scheme for Text Documents
Problem statement: Text documents are the unstructured databases that contain raw data collection. The clustering techniques are used group up the text documents with reference to its similarity. Approach: The feature selection techniques were used to improve the efficiency and accuracy of clustering process. The feature selection was done by eliminate the redundant and irrelevant items from th...
متن کاملA Scalable Architecture for XML Retrieval
Whereas in classical text collections, documents are considered as atomic units, we consider in XML collections elements in documents. This augmented view increases the number of potentially retrieved objects. A retrieved object can be a document, an element in a document, an aggregation of elements or of documents or the whole collection itself. The increase in the number of objects to be inde...
متن کاملA Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کامل